Smart Paper Notes

Output Mode Switching for Parallel Five-bar Manipulators Using a Graph-based Path Planner

Parker B. Edwards , Aravind Baskar , Caroline Hills , Mark Plecnik , Jonathan D. Hauenstein

Category: Robotics

2022-09-22

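The configuration manifolds of parallel manipulators exhibit more nonlinearity than those of serial manipulators. Qualitatively, they can be seen to possess extra folds. When such a manifold is projected onto a space of engineering relevance, such as the output workspace or the input actuator space, these folds cast edges that exhibit nonsmooth behavior. For example, inside the global workspace boundary of a five-bar linkage there appear several local workspace boundaries that constrain only certain output modes of the mechanism. Such boundaries manifest in both the input and the output projections when these projections are studied exclusively instead of the configuration manifold itself. In particular, the design of nonsymmetric parallel manipulators has been confounded by exotic projections in their input and output spaces. In this paper, we represent the configuration space with a radius graph, then weight each edge using homotopy continuation to quantify transmission quality. We then employ a graph path planner to approximate geodesics between configuration points that avoid regions of low transmission quality. Our method automatically generates paths capable of transitioning between non-neighboring output modes, a motion that involves osculating multiple workspace boundaries (local, global, or both). We apply our technique to two nonsymmetric five-bar examples that demonstrate how the transmission properties and other characteristics of the workspace can be selected by switching output modes.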
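
The graph-based planning idea can be illustrated with a minimal sketch. It assumes configuration samples given as plain coordinate tuples, a hypothetical `penalty` function standing in for the transmission-quality weighting (the homotopy continuation step from the abstract is not reproduced), and Dijkstra's algorithm as the path planner. All function names are illustrative, not the authors'.

```python
import heapq
import math

def radius_graph(points, radius):
    """Connect every pair of samples closer than `radius` (O(n^2) for clarity)."""
    edges = {i: [] for i in range(len(points))}
    for i, p in enumerate(points):
        for j in range(i + 1, len(points)):
            d = math.dist(p, points[j])
            if d <= radius:
                edges[i].append((j, d))
                edges[j].append((i, d))
    return edges

def shortest_path(points, edges, start, goal, penalty):
    """Dijkstra search; edge cost = length * mean penalty of its two endpoints."""
    best = {start: 0.0}
    parent = {}
    heap = [(0.0, start)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == goal:
            break
        if cost > best.get(u, math.inf):
            continue  # stale heap entry
        for v, length in edges[u]:
            weight = length * 0.5 * (penalty(points[u]) + penalty(points[v]))
            new_cost = cost + weight
            if new_cost < best.get(v, math.inf):
                best[v] = new_cost
                parent[v] = u
                heapq.heappush(heap, (new_cost, v))
    if goal not in best:
        return None  # goal unreachable in this graph
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]

# Toy example: a 5x3 grid of samples with a hypothetical poor-quality
# region at x == 2, y < 2 that the path should detour around.
grid = [(float(x), float(y)) for x in range(5) for y in range(3)]
bad_region = lambda p: 100.0 if p[0] == 2.0 and p[1] < 2.0 else 1.0
graph = radius_graph(grid, radius=1.1)
route = [grid[i] for i in shortest_path(grid, graph, 0, 12, bad_region)]
```

Multiplying edge length by a penalty makes the shortest path a discrete stand-in for a geodesic under a position-dependent metric, which is one simple way to steer paths away from poorly conditioned regions.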

translated by Google Translate

Adaptive and Dynamic Multi-Resolution Hashing for Pairwise Summations

Lianke Qin , Aravind Reddy , Zhao Song , Zhaozhuo Xu , Danyang Zhuo

Category: Machine Learning

2022-12-21

In this paper, we propose Adam-Hash: an adaptive and dynamic multi-resolution hashing data-structure for fast pairwise summation estimation. Given a data-set $X \subset \mathbb{R}^d$, a binary function $f:\mathbb{R}^d\times \mathbb{R}^d\to \mathbb{R}$, and a point $y \in \mathbb{R}^d$, the Pairwise Summation Estimate is defined as $\mathrm{PSE}_X(y) := \frac{1}{|X|} \sum_{x \in X} f(x,y)$. For any given data-set $X$, we need to design a data-structure such that, given any query point $y \in \mathbb{R}^d$, the data-structure approximately estimates $\mathrm{PSE}_X(y)$ in time that is sub-linear in $|X|$. Prior works on this problem have focused exclusively on the case where the data-set is static and the queries are independent. In this paper, we design a hashing-based PSE data-structure which works for the more practical dynamic setting in which insertions, deletions, and replacements of points are allowed. Moreover, our proposed Adam-Hash is also robust to adaptive PSE queries, where an adversary can choose query $q_j \in \mathbb{R}^d$ depending on the output from previous queries $q_1, q_2, \dots, q_{j-1}$.
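
To make the definition concrete, here is a minimal sketch of the quantity being estimated. It is not Adam-Hash: it shows the exact linear-time sum and a plain uniform-sampling estimate for comparison. The Gaussian kernel is only an assumed example of $f$, and every function name is hypothetical.

```python
import math
import random

def gaussian_kernel(x, y, bandwidth=1.0):
    """Assumed example of f(x, y); the definition allows any binary function."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def pse_exact(X, y, f=gaussian_kernel):
    """PSE_X(y) = (1/|X|) * sum over x in X of f(x, y); costs O(|X|) per query."""
    return sum(f(x, y) for x in X) / len(X)

def pse_sampled(X, y, m, f=gaussian_kernel, rng=random):
    """Monte Carlo estimate from m uniform draws with replacement; costs O(m)."""
    return sum(f(rng.choice(X), y) for _ in range(m)) / m

# Usage: compare the exact value with a sampled estimate.
rng = random.Random(0)
X = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(1000)]
query = (0.0, 0.0)
exact = pse_exact(X, query)
approx = pse_sampled(X, query, m=200, rng=rng)
```

Uniform sampling is unbiased, but its variance can be large when a few points dominate the sum, which is part of the motivation for hashing-based estimators.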

Shakebot: A Low-cost, Open-source Shake Table for Ground Motion Seismic Studies

Zhiang Chen , Devin Keating , Yash Shethwala , Aravind Adhith Pandian Saravanakumaran , Ramon Arrowsmith , Chris Madugo , Albert Kottke , Jnaneshwar Das

Category: Robotics

2022-12-21

Our earlier research built a virtual shake robot in simulation to study the dynamics of precariously balanced rocks (PBR), which are negative indicators of earthquakes in nature. The simulation studies need validation through physical experiments. For this purpose, we developed Shakebot, a low-cost (under $2,000), open-source shake table to validate simulations of PBR dynamics and facilitate other ground motion experiments. The Shakebot is a custom one-dimensional prismatic robotic system with perception and motion software developed using the Robot Operating System (ROS). We adapted affordable and high-accuracy components from 3D printers, particularly a closed-loop stepper motor for actuation and a toothed belt for transmission. The stepper motor enables the bed to reach a maximum horizontal acceleration of 11.8 m/s^2 (1.2 g), and velocity of 0.5 m/s, when loaded with a 2 kg scale-model PBR. The perception system of the Shakebot consists of an accelerometer and a high frame-rate camera. By fusing camera-based displacements with acceleration measurements, the Shakebot is able to carry out accurate bed velocity estimation. The ROS-based perception and motion software simplifies the transition of code from our previous virtual shake robot to the physical Shakebot. The reuse of the control programs ensures that the implemented ground motions are consistent for both the simulation and physical experiments, which is critical to validate our simulation experiments.
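
The abstract does not say how the camera and accelerometer data are fused. One common way to do this is a 1D Kalman filter, sketched below. It assumes a state of bed position and velocity, the accelerometer reading as control input, and the camera displacement as the measurement. The class name, noise variances, and demo values are all hypothetical.

```python
class VelocityFuser:
    """1D Kalman filter over (position, velocity); an assumed fusion scheme."""

    def __init__(self, accel_var, cam_var):
        self.p = 0.0                       # estimated bed position [m]
        self.v = 0.0                       # estimated bed velocity [m/s]
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q = accel_var                 # accelerometer noise variance
        self.r = cam_var                   # camera displacement noise variance

    def predict(self, accel, dt):
        """Propagate the state with the measured acceleration."""
        self.p += self.v * dt + 0.5 * accel * dt * dt
        self.v += accel * dt
        (a, b), (c, d) = self.P
        g0, g1 = 0.5 * dt * dt, dt         # how acceleration noise enters the state
        # P <- F P F^T + q * G G^T with F = [[1, dt], [0, 1]], G = [g0, g1]
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q * g0 * g0,
             b + dt * d + self.q * g0 * g1],
            [c + dt * d + self.q * g0 * g1,
             d + self.q * g1 * g1],
        ]

    def update(self, cam_pos):
        """Correct the state with a camera-based displacement measurement."""
        (a, b), (c, d) = self.P
        s = a + self.r                     # innovation variance
        k0, k1 = a / s, c / s              # Kalman gains
        resid = cam_pos - self.p
        self.p += k0 * resid
        self.v += k1 * resid
        # P <- (I - K H) P with H = [1, 0]
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]

def run_demo(accel_bias=0.0, steps=200, dt=0.01, true_accel=2.0):
    """Constant-acceleration motion; returns the final velocity estimate."""
    fuser = VelocityFuser(accel_var=0.05, cam_var=1e-6)
    for k in range(1, steps + 1):
        t = k * dt
        fuser.predict(true_accel + accel_bias, dt)
        fuser.update(0.5 * true_accel * t * t)  # ideal camera reading
    return fuser.v
```

In this toy setup, integrating a biased accelerometer alone would drift steadily. The camera correction keeps the velocity estimate close to the true value, although a filter without an explicit bias state still leaves a small residual offset.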

Effectiveness of Text, Acoustic, and Lattice-based representations in Spoken Language Understanding tasks

Esaú Villatoro-Tello , Srikanth Madikeri , Juan Zuluaga-Gomez , Bidisha Sharma , Seyyed Saeed Sarfjoo , Iuliia Nigmatulina , Petr Motlicek , Alexei V. Ivanov , Aravind Ganapathiraju

Category: Natural Language Processing | Artificial Intelligence

2022-12-16

In this paper, we perform an exhaustive evaluation of different representations to address the intent classification problem in a Spoken Language Understanding (SLU) setup. We benchmark three types of systems to perform the SLU intent detection task: 1) text-based, 2) lattice-based, and 3) a novel multimodal approach. Our work provides a comprehensive analysis of the performance achievable by different state-of-the-art SLU systems under different circumstances, e.g., automatically vs. manually generated transcripts. We evaluate the systems on the publicly available SLURP spoken language resource corpus. Our results indicate that using richer forms of Automatic Speech Recognition (ASR) outputs allows SLU systems to improve in comparison to the 1-best setup (4% relative improvement). However, crossmodal approaches, i.e., learning from acoustic and text embeddings, obtain performance similar to the oracle setup, and a relative improvement of 18% over the 1-best configuration. Thus, crossmodal architectures represent a good alternative to overcome the limitations of working with purely automatically generated textual data.

On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline

Nicklas Hansen , Zhecheng Yuan , Yanjie Ze , Tongzhou Mu , Aravind Rajeswaran , Hao Su , Huazhe Xu , Xiaolong Wang

Category: Machine Learning | Computer Vision | Robotics

2022-12-12

We revisit a simple Learning-from-Scratch baseline for visuo-motor control that uses data augmentation and a shallow ConvNet. We find that this baseline has competitive performance with recent methods that leverage frozen visual representations trained on large-scale vision datasets.

CACTI: A Framework for Scalable Multi-Task Multi-Scene Visual Imitation Learning

Zhao Mandi , Homanga Bharadhwaj , Vincent Moens , Shuran Song , Aravind Rajeswaran , Vikash Kumar

Category: Robotics | Artificial Intelligence | Machine Learning

2022-12-12

Developing robots that are capable of many skills and generalization to unseen scenarios requires progress on two fronts: efficient collection of large and diverse datasets, and training of high-capacity policies on the collected data. While large datasets have propelled progress in other fields like computer vision and natural language processing, collecting data of comparable scale is particularly challenging for physical systems like robotics. In this work, we propose a framework to bridge this gap and better scale up robot learning, under the lens of multi-task, multi-scene robot manipulation in kitchen environments. Our framework, named CACTI, has four stages that separately handle data collection, data augmentation, visual representation learning, and imitation policy training. In the CACTI framework, we highlight the benefit of adapting state-of-the-art models for image generation as part of the augmentation stage, and the significant improvement of training efficiency by using pretrained out-of-domain visual representations at the compression stage. Experimentally, we demonstrate that 1) on a real robot setup, CACTI enables efficient training of a single policy capable of 10 manipulation tasks involving kitchen objects, and robust to varying layouts of distractor objects; 2) in a simulated kitchen environment, CACTI trains a single policy on 18 semantic tasks across up to 50 layout variations per task. The simulation task benchmark and augmented datasets in both real and simulated environments will be released to facilitate future research.

MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations

Nicklas Hansen , Yixin Lin , Hao Su , Xiaolong Wang , Vikash Kumar , Aravind Rajeswaran

Category: Machine Learning | Artificial Intelligence | Robotics

2022-12-12

Poor sample efficiency continues to be the primary challenge for deployment of deep Reinforcement Learning (RL) algorithms for real-world applications, and in particular for visuo-motor control. Model-based RL has the potential to be highly sample efficient by concurrently learning a world model and using synthetic rollouts for planning and policy improvement. However, in practice, sample-efficient learning with model-based RL is bottlenecked by the exploration challenge. In this work, we find that leveraging just a handful of demonstrations can dramatically improve the sample-efficiency of model-based RL. Simply appending demonstrations to the interaction dataset, however, does not suffice. We identify key ingredients for leveraging demonstrations in model learning -- policy pretraining, targeted exploration, and oversampling of demonstration data -- which form the three phases of our model-based RL framework. We empirically study three complex visuo-motor control domains and find that our method is 150%-250% more successful in completing sparse reward tasks compared to prior approaches in the low data regime (100K interaction steps, 5 demonstrations). Code and videos are available at: https://nicklashansen.github.io/modemrl
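
Oversampling of demonstration data can be illustrated with a minimal batch sampler. This is a sketch under assumptions, not the MoDem implementation: the 25% demonstration fraction, the tuple-based transitions, and the function name are all hypothetical.

```python
import random

def sample_batch(demos, interactions, batch_size, demo_fraction, rng=random):
    """Draw a batch with a fixed share of demonstration transitions.

    With only a few demonstrations and a large interaction buffer, uniform
    sampling over the merged data would almost never pick a demonstration.
    Reserving a fixed fraction of each batch oversamples them instead.
    """
    if not demos or not interactions:
        raise ValueError("both buffers must be non-empty")
    n_demo = round(batch_size * demo_fraction)
    batch = [rng.choice(demos) for _ in range(n_demo)]
    batch += [rng.choice(interactions) for _ in range(batch_size - n_demo)]
    rng.shuffle(batch)
    return batch

# Usage with placeholder transitions tagged by their source.
demos = [("demo", i) for i in range(5)]
interactions = [("env", i) for i in range(1000)]
batch = sample_batch(demos, interactions, batch_size=32, demo_fraction=0.25,
                     rng=random.Random(0))
```

Here the 5 demonstrations make up about 0.5% of the data but 25% of every batch, which keeps them influential in model learning as the interaction buffer grows.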

Multi-Label Chest X-Ray Classification via Deep Learning

Aravind Sasidharan Pillai

Category: Computer Vision

2022-11-27

In this era of pandemic, the future of the healthcare industry has never been more exciting. Artificial intelligence and machine learning (AI & ML) present opportunities to develop solutions that cater to very specific needs within the industry. Deep learning in healthcare has become incredibly powerful for supporting clinics and for transforming patient care in general. Deep learning is increasingly being applied to detect clinically important features in images beyond what can be perceived by the naked human eye. Chest X-ray imaging is one of the most common clinical methods for diagnosing a number of diseases such as pneumonia and lung cancer, as well as many other abnormalities such as lesions and fractures. Proper diagnosis of a disease from X-ray images is often a challenging task even for expert radiologists, and there is a growing need for computerized support systems due to the large amount of information encoded in X-ray images. The goal of this paper is to develop a lightweight solution to detect 14 different chest conditions from an X-ray image. Given an X-ray image as input, our classifier outputs a label vector indicating which of the 14 disease classes the image falls into. Along with the image features, we also use non-image features available in the data, such as X-ray view type, age, and gender. The original study conducted by the Stanford ML Group is our baseline. That study focuses on predicting 5 diseases. Our aim is to improve upon previous work, expand prediction to 14 diseases, and provide insight for future chest radiography research.
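
The 14-entry label vector implies multi-label rather than multi-class output, since one image can show several conditions at once. A minimal sketch of the usual output head follows, assuming one independent sigmoid per condition and a 0.5 decision threshold. Neither detail is stated in the abstract, and the function names are illustrative.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_labels(logits, threshold=0.5):
    """Turn per-condition logits into a 0/1 label vector (labels not exclusive)."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]

def binary_cross_entropy(logits, targets):
    """Mean per-label binary cross-entropy, computed stably from logits."""
    total = 0.0
    for z, t in zip(logits, targets):
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

# Usage: 14 logits, one per chest condition (values are made up).
logits = [2.0, -1.0, 0.0] + [-3.0] * 11
labels = predict_labels(logits)
```

Unlike a softmax, the per-label sigmoids do not compete, so any number of conditions from none to all 14 can be flagged for the same image.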

Neural PDE Solvers for Irregular Domains

Biswajit Khara , Ethan Herron , Zhanhong Jiang , Aditya Balu , Chih-Hsuan Yang , Kumar Saurabh , Anushrut Jignasu , Soumik Sarkar , Chinmay Hegde , Adarsh Krishnamurthy

Category: Machine Learning

2022-11-07

Neural network-based approaches for solving partial differential equations (PDEs) have recently received special attention. However, the large majority of neural PDE solvers only apply to rectilinear domains, and do not systematically address the imposition of Dirichlet/Neumann boundary conditions over irregular domain boundaries. In this paper, we present a framework to neurally solve partial differential equations over domains with irregularly shaped (non-rectilinear) geometric boundaries. Our network takes in the shape of the domain as an input (represented using an unstructured point cloud, or any other parametric representation such as Non-Uniform Rational B-Splines) and is able to generalize to novel (unseen) irregular domains; the key technical ingredient to realizing this model is a novel approach for identifying the interior and exterior of the computational grid in a differentiable manner. We also perform a careful error analysis which reveals theoretical insights into several sources of error incurred in the model-building process. Finally, we showcase a wide variety of applications, along with favorable comparisons with ground truth solutions.
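
The abstract does not describe how interior and exterior are identified. As a generic illustration only, one smooth way to mark the interior of an irregular domain on a grid is to pass a signed distance through a sigmoid. The circular domain, the `sharpness` parameter, and the function names below are assumptions, not the paper's technique.

```python
import math

def signed_distance_circle(x, y, cx, cy, r):
    """Signed distance to a circle: negative inside, positive outside."""
    return math.hypot(x - cx, y - cy) - r

def soft_interior_mask(n, cx, cy, r, sharpness=50.0):
    """Smooth inside/outside indicator on an n x n grid over the unit square.

    Values approach 1 inside the domain and 0 outside. Because the mask is
    a smooth function of the shape parameters (cx, cy, r), gradients can
    flow through it, unlike a hard 0/1 test.
    """
    mask = []
    for i in range(n):
        row = []
        for j in range(n):
            x, y = j / (n - 1), i / (n - 1)
            d = signed_distance_circle(x, y, cx, cy, r)
            row.append(1.0 / (1.0 + math.exp(sharpness * d)))
        mask.append(row)
    return mask

# Usage: a disc of radius 0.3 centred in the unit square.
mask = soft_interior_mask(11, cx=0.5, cy=0.5, r=0.3)
```

A larger `sharpness` brings the mask closer to a hard indicator but gives steeper gradients near the boundary.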

Cross apprenticeship learning framework: Properties and solution approaches

Ashwin Aravind , Debasish Chatterjee , Ashish Cherukuri

Category: Machine Learning

2022-09-06

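Apprenticeship learning is a framework in which an agent learns a policy to perform a given task in an environment using example trajectories provided by an expert. In the real world, one might have access to expert trajectories in different environments where the system dynamics differ while the learning task is the same. For such scenarios, two types of learning objectives can be defined: one where the learned policy performs very well in one specific environment, and another where it performs well across all environments. To balance these two objectives in a principled way, our work presents the cross apprenticeship learning (CAL) framework. It consists of an optimization problem in which an optimal policy is sought for each environment while ensuring that all policies remain close to each other. This closeness is controlled by a tuning parameter in the optimization problem. We derive properties of the optimizers of the problem as the tuning parameter varies. Since the problem is nonconvex, we provide a convex outer approximation. Finally, we demonstrate the properties of our framework on a navigation task in a windy environment.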
